Multi delimiter support #4
a3f4349 to 2d85820
I just rebased onto master so the diff is now showing only the changes that are part of this PR (not the ones from #3).
    }, []);
    return new RegExp(tokenMatchers.join('|'), 'g');
    }
This is the core of the change. Now it uses the `delimiters` array (which by default contains the values from `startDelimiter` and `endDelimiter`) to create the token regular expression.
It constructs a regular expression (as a string) for each element of the array, then joins them together with `|` ("or"). That becomes the full regular expression that is used to identify things to not pseudolocalize.
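The construction described in this comment can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function name `buildTokenRegex`, the lazy `.*?` body between delimiters, and the sample matcher patterns are all assumptions made for the example.

```javascript
// Hypothetical sketch: turn an array of matcher objects into one global RegExp.
// Each matcher contributes one pattern string; the strings are joined with '|'.
function buildTokenRegex(delimiters) {
  const tokenMatchers = delimiters.reduce((matchers, d) => {
    if (d.full) {
      // The whole token pattern is given directly.
      matchers.push(d.full);
    } else if (d.both) {
      // The same marker opens and closes the token.
      matchers.push(d.both + '.*?' + d.both);
    } else if (d.start && d.end) {
      // A start marker that is paired with its own end marker.
      matchers.push(d.start + '.*?' + d.end);
    }
    return matchers;
  }, []);
  return new RegExp(tokenMatchers.join('|'), 'g');
}

// Assumed example matchers (regex source strings, so backslashes are doubled).
const tokenRegex = buildTokenRegex([
  { start: '%\\(', end: '\\)[sd]' },
  { start: '<', end: '>' },
  { full: '%[sd]' },
]);

const tokens = 'Hello %(name)s, you have %d <b>new</b> items.'.match(tokenRegex);
console.log(tokens); // [ '%(name)s', '%d', '<b>', '</b>' ]
```

Because `|` has the lowest precedence in a regular expression, each joined pattern stays an independent alternative without extra grouping.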
```
{ full: '%d' }
```
This is the documentation for the change
    pseudoloc.option.endDelimiter = '\\)[sd]';
    pseudoloc.str('A test string with a %(token)s.');
    // [!!Á ţȇšŧ śťřīņğ ŵıţħ ą %(token)s.!!]
This is just some additional explanation about the way the `options.delimiter*` options work. It took us a bit of trial and error to figure it out, so hopefully this will help others.
looks good and very useful, thanks! I'll try and get it packaged to npm in the next couple of days
Sounds great. Thanks!
On Wed, Jun 6, 2018, 4:47 PM Jac wrote:
> looks good and very useful, thanks! I'll try and get it packaged to npm in the next couple of days
Add support for multiple delimiters for tokens that are not transformed. This was requested in bunkat#6, and it's something I need too.

Currently it's possible to specify a delimiter or delimiters to use as the start and end markers for a token that shouldn't be transformed. However, in my case there are multiple styles of tokens used in the strings that are being translated. For example:

- `%(variable)s` (named sprintf-style aka "Python-style")
- `%s` (indexed sprintf-style aka "JavaScript-style")
- `<someTag>` and `</someTag>` (HTML/XML-style start and end tags)
- `<SomeTag/>` (HTML/XML-style self-closing tags)

Since the delimiters are treated as part of a RegExp, one workaround is to just include the start and end portions of each token in `options.startDelimiter` and `options.endDelimiter`. However, that has a few problems:

1. Start and end delimiters are not matched in pairs, so `%(variable/>` could theoretically be identified as a token.
2. Some tokens can't be matched at all (e.g. `%s`) because they don't match the "start delimiter + name + end delimiter" requirement.

This PR adds an additional way of specifying multiple delimiters that, when desired, are specifically matched in pairs to avoid problem (1). It also introduces an option (within the multiple delimiters) for a "full" token matcher that defines the full pattern for a token, which solves problem (2).
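Problem (1) can be reproduced with plain regular expressions. The sketch below assumes a token pattern is built as "start + lazy body + end"; the helper `tokenPattern` and the specific delimiter strings are hypothetical and chosen only to illustrate the mismatch, not taken from the library.

```javascript
// Assumption: a token pattern is start + lazy body + end (not the library's code).
function tokenPattern(start, end) {
  return '(?:' + start + ').*?(?:' + end + ')';
}

// Workaround: all start markers in one alternation, all end markers in another.
const combined = new RegExp(tokenPattern('%\\(|<', '\\)s|/?>'), 'g');

// Paired: each token style keeps its own start and end, joined with '|'.
const paired = new RegExp(
  [tokenPattern('%\\(', '\\)s'), tokenPattern('<', '/?>')].join('|'),
  'g'
);

const mismatched = 'Value: %(variable/> done';

// The combined pattern accepts a Python-style start with an XML-style end.
const combinedHits = mismatched.match(combined);
console.log(combinedHits); // [ '%(variable/>' ]

// The paired pattern finds no token in the same string.
const pairedHits = mismatched.match(paired);
console.log(pairedHits); // null
```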
The PR adds an additional option, `delimiters` (note the "s"). The `options.delimiters` property accepts an array of "matcher" objects, each of which defines one style of token that should be excluded from pseudolocalization. The types of matchers are:

- `{ start, end }`: specifies a pair of start/end delimiters to match. This is equivalent to using `startDelimiter` and `endDelimiter` for each pair.
- `{ both }`: specifies a single marker to use as both the start and end delimiters. This is like using `delimiter`.
- `{ full }`: specifies a regular expression to use as the pattern for the entire token. This allows for other possibilities that don't work within the constraints of the "delimiter + name + delimiter" or "startDelimiter + name + endDelimiter" structure.
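As an illustration of the three matcher types, a `delimiters` array covering the token styles listed earlier might look like the sketch below. The pattern strings are assumptions written for this example (including the hypothetical `$$name$$` style used to show `both`), not values taken from the PR.

```javascript
// Illustrative matcher list; every pattern here is an assumed example value.
const delimiters = [
  { start: '%\\(', end: '\\)[sd]' }, // %(variable)s or %(variable)d
  { full: '%[sd]' },                 // %s or %d, which have no name part
  { start: '<', end: '>' },          // <someTag>, </someTag>, <SomeTag/>
  { both: '\\$\\$' },                // hypothetical $$name$$ style
];

// Each value is a regular-expression source string, so compiling them up front
// surfaces escaping mistakes (such as a single backslash) early.
const compiled = delimiters.map((matcher) =>
  Object.values(matcher).map((source) => new RegExp(source))
);

// With a pseudoloc build that includes this PR, the array would be assigned to
// the `delimiters` option described above.
```

Since the values are regex source strings inside JavaScript string literals, literal characters such as `(` and `$` need a doubled backslash.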